Landmark Extraction: A Web Mining Approach

Authors

  • Taro Tezuka
  • Katsumi Tanaka
Abstract

Landmarks play crucial roles in human geographic knowledge. Much work has focused on extracting landmarks from geographic information systems (GIS) or 3D city models. The extraction of landmarks from digital documents, however, has not been fully explored. The World Wide Web provides a rich source of region-related information based on our understanding of geographic space. Web mining enables a new means of extracting landmarks, distinct from conventional vision-oriented methods. Our approach is based on how geographic objects are expressed by humans, instead of how they are observed. We extend existing text mining methods so that spatial context is considered. The experimental results showed that incorporating spatial context into text mining improves the precision of extracting landmarks from web documents.

Similar resources

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

Nowadays, high fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns having a high utility that meets a user-specified minimum high utility threshold. It comprises extracting patterns which are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

STALKER: Learning Wrappers for Semistructured, Web-based Information Sources

Information mediators are systems capable of providing a unified view of several information sources. Central to any mediator that accesses Web-based sources is a set of wrappers that can extract relevant information from Web pages. In this paper, we present a wrapper-induction algorithm that generates extraction rules for Web-based information sources. We introduce landmark automata, a formali...

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. In our extraction algorithm, we are inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

Automated Annotation of Landmark Images Using Community Contributed Datasets and Web Resources

A novel solution to the challenge of automatic image annotation is described. Given an image with GPS data of its location of capture, our system returns a semantically-rich annotation comprising tags which both identify the landmark in the image, and provide an interesting fact about it, e.g. “A view of the Eiffel Tower, which was built in 1889 for an international exhibition in Paris”. This e...

Landmark-Based Navigation and Path-Finding in Web Mining

We introduce a new method of Web relationship based on landmark identification and association. This allows us to improve the performance of searching relevant information in Web intelligent navigation. In order to be more convenient for users to find and search Web information, we have developed an EMS Search Engine as opposed to the traditional search engines. The operations of EMS Search Engine...

Publication date: 2005